Long non-coding RNA-disease association prediction model based on semantic and global dual attention mechanism
Yi ZHANG, Gangsheng CAI, Zhenmei WANG
Journal of Computer Applications    2023, 43 (7): 2125-2132.   DOI: 10.11772/j.issn.1001-9081.2022060872

To address the limited use that existing long non-coding RNA (lncRNA)-disease association prediction models make of the interaction and semantic information in heterogeneous biological networks, an lncRNA-Disease Association prediction model based on a Semantic and Global dual Attention mechanism (SGALDA) was proposed. Firstly, an lncRNA-disease-microRNA (miRNA) heterogeneous network was constructed from similarities and known associations, and a feature extraction module based on message passing types was designed to extract and fuse the neighborhood features of homogeneous and heterogeneous nodes, thereby capturing multi-level interactions on the network. Secondly, the heterogeneous network was decomposed into multiple semantic sub-networks based on meta-paths, and a Graph Convolutional Network (GCN) was applied to each sub-network to extract the semantic features of nodes, thereby capturing high-order interactions. Thirdly, a semantic and global dual attention mechanism was used to fuse the semantic and neighborhood features of the nodes into more representative node features. Finally, lncRNA-disease associations were reconstructed from the inner product of lncRNA node features and disease node features. In 5-fold cross-validation, SGALDA achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.9945±0.0002 and an Area Under the Precision-Recall curve (AUPR) of 0.9167±0.0011, both the highest among all the comparison models. Case studies on breast cancer and stomach cancer further demonstrate the ability of SGALDA to identify potential lncRNA-disease associations, indicating that SGALDA has the potential to be a reliable lncRNA-disease association prediction model.
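
The steps named in this abstract (GCN propagation on a sub-network, attention-weighted fusion of per-sub-network features, inner-product reconstruction) can be illustrated with a minimal standard-library sketch. It is not the SGALDA implementation: the function names, the `query` attention vector, and the toy inputs are hypothetical, and the GCN step omits the learned weight matrix and nonlinearity that a real layer would include.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gcn_propagate(adj, feats):
    """One weight-free GCN propagation step: D^-1/2 (A + I) D^-1/2 X.

    adj is a square adjacency matrix of one semantic sub-network,
    feats holds one feature vector per node.
    """
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    dim = len(feats[0])
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in range(n):
            if a_hat[i][j]:
                coef = a_hat[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(dim):
                    row[k] += coef * feats[j][k]
        out.append(row)
    return out

def semantic_attention_fuse(views, query):
    """Fuse one node's embeddings from several sub-networks.

    views: one embedding per meta-path sub-network.
    query: hypothetical attention vector (learned in a real model).
    Returns the fused embedding and the attention weights.
    """
    scores = [dot(query, [math.tanh(x) for x in v]) for v in views]
    weights = softmax(scores)
    dim = len(views[0])
    fused = [sum(w * v[i] for w, v in zip(weights, views)) for i in range(dim)]
    return fused, weights

def inner_product_decoder(lnc_feats, dis_feats):
    """Score every lncRNA-disease pair as sigmoid(z_lncRNA . z_disease)."""
    return [[1.0 / (1.0 + math.exp(-dot(l, d))) for d in dis_feats]
            for l in lnc_feats]

if __name__ == "__main__":
    # Toy data: one lncRNA seen through two sub-networks, two diseases.
    fused, weights = semantic_attention_fuse([[0.9, 0.1], [0.7, 0.3]], [1.0, 0.0])
    print(weights)
    print(inner_product_decoder([fused], [[1.0, 0.0], [-1.0, 0.0]]))
```

In this sketch the attention weights sum to one, so the fused vector is a convex combination of the per-sub-network embeddings; the decoder then maps each inner product to a score in (0, 1).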

circRNA-disease association prediction by two-stage fusion on graph auto-encoder
Yi ZHANG, Zhenmei WANG
Journal of Computer Applications    2023, 43 (6): 1979-1986.   DOI: 10.11772/j.issn.1001-9081.2022050727

Most existing computational models for predicting associations between circular RNA (circRNA) and diseases rely on biological knowledge such as circRNA- and disease-related data, and mine potential associations by combining it with known circRNA-disease association pairs. However, the networks built from known associations are sparse and contain too few negative samples, which results in poor prediction performance. Therefore, inductive matrix completion and a self-attention mechanism were introduced for two-stage fusion on a graph auto-encoder to predict circRNA-disease associations, giving the model GIS-CDA (Graph auto-encoder combining Inductive matrix completion and Self-attention mechanism for predicting CircRNA-Disease Association). Firstly, the integrated circRNA similarity and the integrated disease similarity were calculated, and a graph auto-encoder was used to learn latent features of circRNAs and diseases as low-dimensional representations. Secondly, the learned features were fed into inductive matrix completion to strengthen the similarity and dependence between nodes. Thirdly, the circRNA feature matrix and the disease feature matrix were integrated into a circRNA-disease feature matrix to enhance the stability and accuracy of prediction. Finally, a self-attention mechanism was introduced to extract important features from the feature matrix and reduce the dependence on other biological information. The results of five-fold and ten-fold cross-validation show that the Area Under the Receiver Operating Characteristic curve (AUROC) values of GIS-CDA are 0.9303 and 0.9393 respectively; the former is 13.19, 35.73, 13.28 and 5.01 percentage points higher than those of the prediction models KATZ measures for Human CircRNA-Disease Association (KATZHCDA), Deep Matrix Factorization for CircRNA-Disease Association (DMFCDA), Random Walk with Restart (RWR) and Speedup Inductive Matrix Completion for CircRNA-Disease Associations (SIMCCDA), respectively. The Area Under the Precision-Recall curve (AUPR) values of GIS-CDA are 0.2271 and 0.2340 respectively; the former is 21.72, 22.43, 21.96 and 13.86 percentage points higher than those of the above comparison models, respectively. In addition, ablation experiments and case studies on the circRNADisease, circ2Disease and circR2Disease datasets further validate the performance of GIS-CDA in predicting potential circRNA-disease associations.
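
The two metrics and the percentage-point comparisons reported here can be illustrated with a small standard-library sketch. It is not the authors' evaluation code: the function names and toy labels are hypothetical, average precision is used as one common estimator of AUPR (the paper may integrate the curve differently), and distinct scores are assumed for the precision-recall ranking.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a
    random negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve: the mean of the
    precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def implied_baseline(value, gain_pp):
    """Baseline metric implied by a reported gain in percentage points.

    A gain of 13.19 percentage points on an AUROC of 0.9303 implies a
    baseline of 0.9303 - 0.1319; the result is derived arithmetic, not a
    figure quoted from the paper.
    """
    return round(value - gain_pp / 100.0, 4)

if __name__ == "__main__":
    labels = [1, 0, 1, 0, 0]            # toy ground truth
    scores = [0.9, 0.8, 0.7, 0.3, 0.1]  # toy predicted scores
    print(auroc(labels, scores), average_precision(labels, scores))
    print(implied_baseline(0.9303, 13.19))
```

A percentage-point gap is an absolute difference between two metric values, so it is subtracted directly after dividing by 100 rather than applied as a relative percentage.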
